Importance results are shown as boxplots, split by FI method and learner type, separately for each dataset / DGP. Table 1 lists all DGPs used in the importance benchmark.
| Dataset | p | Description | Model |
|---|---|---|---|
| ewald | 5 | DGP used by Ewald et al. (2024) | \(Y = X_4 + X_5 + X_4 X_5 + \varepsilon\) |
| correlated | 4 | \(X_1\) and \(X_2\) correlated (\(r \in \{0.25, 0.75\}\)) | \(Y = 2 X_1 + X_3 + \varepsilon\) |
| interactions | 5 | \(X_1\) and \(X_2\) interact but have no direct effects | \(Y = 2 X_1 X_2 + X_3 + \varepsilon\) |
| friedman1 | 10 | Regression benchmark from mlbench | \(Y = 10 \sin(\pi X_1 X_2) + 20 (X_3 - 0.5)^2 + 10 X_4 + 5 X_5 + \varepsilon\) |
| independent | 5 | Uncorrelated features with direct effects | \(Y = 2 X_1 + X_2 + 0.5 X_3 + \varepsilon\) |
| confounded | 4 | Unobserved confounder \(H\) with proxy | \(Y = H + X_1 + \varepsilon\) |
| mediated | 4 | Mediator masks exposure variable | \(Y = 1.5 \cdot \mathrm{mediator} + 0.5 \cdot \mathrm{direct} + \varepsilon\) |
| bike sharing | 12 | Real-world dataset | N/A |
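
To make the model column of Table 1 concrete, the following is a minimal sketch of how three of the simpler DGPs (independent, interactions, correlated) could be simulated. The function name `simulate_dgp`, the standard normal feature distribution, the Gaussian noise with standard deviation `noise_sd`, and the way the correlation `r` is induced are all assumptions made for illustration; the source does not specify them, and the benchmark's actual generators may differ.

```python
import math
import random


def simulate_dgp(name, n=1000, r=0.75, noise_sd=1.0, seed=0):
    """Draw n observations from one of three Table 1 DGPs (illustrative only).

    Assumptions not taken from the source: features are standard normal,
    noise is Gaussian with sd `noise_sd`, and x[0] corresponds to X_1.
    """
    n_features = {"independent": 5, "interactions": 5, "correlated": 4}
    if name not in n_features:
        raise ValueError(f"unknown DGP: {name}")

    rng = random.Random(seed)
    rows, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features[name])]
        eps = rng.gauss(0.0, noise_sd)
        if name == "correlated":
            # Mix X_1 into X_2 so that corr(X_1, X_2) = r with unit variance.
            x[1] = r * x[0] + math.sqrt(1.0 - r**2) * x[1]
            y = 2 * x[0] + x[2] + eps          # Y = 2 X_1 + X_3 + eps
        elif name == "interactions":
            y = 2 * x[0] * x[1] + x[2] + eps   # Y = 2 X_1 X_2 + X_3 + eps
        else:
            y = 2 * x[0] + x[1] + 0.5 * x[2] + eps  # independent DGP
        rows.append(x)
        ys.append(y)
    return rows, ys


rows, ys = simulate_dgp("correlated", n=500, r=0.25)
```

Features that do not appear in a model formula (for example \(X_4\) and \(X_5\) in the independent DGP) are generated but have no effect on \(Y\) in this sketch, so a well-behaved FI method should assign them importance near zero.
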
Importances are scaled to percentages: within each replication, the highest importance value that a method assigns on a dataset is set to 100, and all other values are expressed relative to it.
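
The scaling can be sketched as follows for the scores of a single method on a single dataset in a single replication. The function name `scale_to_percent` and the example values are hypothetical. How non-positive maxima and negative scores are treated is not stated in the source; this sketch raises an error in the former case and leaves negative scores as negative percentages in the latter.

```python
def scale_to_percent(importances):
    """Scale scores so that the largest one becomes 100.

    `importances` maps feature name -> raw importance for ONE method,
    dataset, and replication. Handling of edge cases is an assumption.
    """
    top = max(importances.values())
    if top <= 0:
        raise ValueError("scaling is undefined when no score is positive")
    return {feature: value / top * 100.0 for feature, value in importances.items()}


# Hypothetical raw scores, for illustration only.
raw = {"x1": 0.8, "x2": 0.2, "x3": -0.04}
scaled = scale_to_percent(raw)
```

Because the scaling is applied per method, dataset, and replication, the scaled values show how each method distributes importance across features, not how raw magnitudes compare between methods.
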
Scaled feature importance scores for the Confounded task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Ewald task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Friedman1 task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Independent task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Interactions task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Mediated task by FI method and learner, colored by implementing package.
Scaled feature importance scores for the Bike Sharing task by FI method and learner, colored by implementing package.
Importance scores are converted to ranks, with rank 1 (leftmost) assigned to the highest importance score, i.e., the most important feature as judged by the respective method. Boxplots of the ranks illustrate how consistent the rankings are across replications.
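
A minimal sketch of the rank conversion for one method, dataset, and replication is given below. The function name `rank_importances` and the example values are hypothetical, and the tie rule (tied features share the average of the positions they occupy) is an assumption, since the source does not say how ties are resolved.

```python
def rank_importances(importances):
    """Convert scores to ranks, with rank 1 for the highest score.

    Tied scores receive the average of the positions they span
    (an assumed tie rule, not taken from the source).
    """
    ordered = sorted(importances, key=importances.get, reverse=True)
    ranks = {}
    start = 0
    while start < len(ordered):
        # Extend `end` over the run of features tied with ordered[start].
        end = start
        while (end + 1 < len(ordered)
               and importances[ordered[end + 1]] == importances[ordered[start]]):
            end += 1
        average_rank = (start + 1 + end + 1) / 2
        for feature in ordered[start:end + 1]:
            ranks[feature] = average_rank
        start = end + 1
    return ranks


# Hypothetical scores, for illustration only.
ranks = rank_importances({"x1": 3.0, "x2": 1.0, "x3": 3.0, "x4": 0.5})
```

Ranks discard the magnitude of the scores, so the rank boxplots complement the scaled ones: a feature with a narrow rank distribution is ordered consistently even if its scaled importance varies.
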
Ranked feature importance scores for the Confounded task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Ewald task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Friedman1 task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Independent task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Interactions task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Mediated task by FI method and learner, colored by implementing package.
Ranked feature importance scores for the Bike Sharing task by FI method and learner, colored by implementing package.